{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first part of this [series](http://pbpython.com/excel-pandas-comp.html) was very well received so I thought I would continue the theme of showing how to do common Excel tasks in pandas.\n",
    "\n",
    "In the first article, I focused on common, math tasks in Excel and how to do them in pandas. In this article, I'll focus on some other Excel tasks related to data selection and how to map them to pandas. \n",
    "\n",
    "Please refer to [this post](http://pbpython.com/excel-pandas-comp-2.html) for the full post."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting Set Up"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import the pandas and numpy modules."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load in the Excel data that represents a year's worth of sales."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_excel(\"../data/sample-salesv3.xlsx\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Take a quick look at the data types to make sure everything came through as expected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "account number      int64\n",
       "name               object\n",
       "sku                object\n",
       "quantity            int64\n",
       "unit price        float64\n",
       "ext price         float64\n",
       "date               object\n",
       "dtype: object"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You'll notice that our date column is showing up as a generic `object`. We are going to convert it to datetime object to make some selections a little easier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['date'] = pd.to_datetime(df['date'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "      <td>2014-01-01 10:00:47</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "      <td>2014-01-01 15:05:22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>6</td>\n",
       "      <td>83.21</td>\n",
       "      <td>499.26</td>\n",
       "      <td>2014-01-01 23:26:55</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   account number                         name       sku  quantity  \\\n",
       "0          740150                   Barton LLC  B1-20000        39   \n",
       "1          714466              Trantow-Barrows  S2-77896        -1   \n",
       "2          218895                    Kulas Inc  B1-69924        23   \n",
       "3          307599  Kassulke, Ondricka and Metz  S1-65481        41   \n",
       "4          412290                Jerde-Hilpert  S2-34077         6   \n",
       "\n",
       "   unit price  ext price                date  \n",
       "0       86.69    3380.91 2014-01-01 07:21:51  \n",
       "1       63.16     -63.16 2014-01-01 10:00:47  \n",
       "2       90.70    2086.10 2014-01-01 13:24:58  \n",
       "3       21.05     863.05 2014-01-01 15:05:22  \n",
       "4       83.21     499.26 2014-01-01 23:26:55  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "account number             int64\n",
       "name                      object\n",
       "sku                       object\n",
       "quantity                   int64\n",
       "unit price               float64\n",
       "ext price                float64\n",
       "date              datetime64[ns]\n",
       "dtype: object"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The date is now a datetime object which will be useful in future steps."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Filtering the data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similar to the autofilter function in Excel, you can use pandas to filter and select certain subsets of data.\n",
    "\n",
    "For instance, if we want to just see a specific account number, we can easily do that with pandas.\n",
    "\n",
    "Note, I am going to use the `head` function to show the top results. This is purely for the purposes of keeping the article shorter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "      <td>2014-01-01 15:05:22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S2-10342</td>\n",
       "      <td>17</td>\n",
       "      <td>12.44</td>\n",
       "      <td>211.48</td>\n",
       "      <td>2014-01-04 07:53:01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S2-78676</td>\n",
       "      <td>35</td>\n",
       "      <td>33.04</td>\n",
       "      <td>1156.40</td>\n",
       "      <td>2014-01-10 05:26:31</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>58</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>22</td>\n",
       "      <td>37.87</td>\n",
       "      <td>833.14</td>\n",
       "      <td>2014-01-15 16:22:22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>70</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S2-10342</td>\n",
       "      <td>44</td>\n",
       "      <td>96.79</td>\n",
       "      <td>4258.76</td>\n",
       "      <td>2014-01-18 06:32:31</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    account number                         name       sku  quantity  \\\n",
       "3           307599  Kassulke, Ondricka and Metz  S1-65481        41   \n",
       "13          307599  Kassulke, Ondricka and Metz  S2-10342        17   \n",
       "34          307599  Kassulke, Ondricka and Metz  S2-78676        35   \n",
       "58          307599  Kassulke, Ondricka and Metz  B1-20000        22   \n",
       "70          307599  Kassulke, Ondricka and Metz  S2-10342        44   \n",
       "\n",
       "    unit price  ext price                date  \n",
       "3        21.05     863.05 2014-01-01 15:05:22  \n",
       "13       12.44     211.48 2014-01-04 07:53:01  \n",
       "34       33.04    1156.40 2014-01-10 05:26:31  \n",
       "58       37.87     833.14 2014-01-15 16:22:22  \n",
       "70       96.79    4258.76 2014-01-18 06:32:31  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df[\"account number\"]==307599].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You could also do the filtering based on numeric values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "      <td>2014-01-01 15:05:22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>23</td>\n",
       "      <td>71.56</td>\n",
       "      <td>1645.88</td>\n",
       "      <td>2014-01-04 08:57:48</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>239344</td>\n",
       "      <td>Stokes LLC</td>\n",
       "      <td>S1-06532</td>\n",
       "      <td>34</td>\n",
       "      <td>71.51</td>\n",
       "      <td>2431.34</td>\n",
       "      <td>2014-01-04 11:34:58</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    account number                          name       sku  quantity  \\\n",
       "0           740150                    Barton LLC  B1-20000        39   \n",
       "2           218895                     Kulas Inc  B1-69924        23   \n",
       "3           307599   Kassulke, Ondricka and Metz  S1-65481        41   \n",
       "14          737550  Fritsch, Russel and Anderson  B1-53102        23   \n",
       "15          239344                    Stokes LLC  S1-06532        34   \n",
       "\n",
       "    unit price  ext price                date  \n",
       "0        86.69    3380.91 2014-01-01 07:21:51  \n",
       "2        90.70    2086.10 2014-01-01 13:24:58  \n",
       "3        21.05     863.05 2014-01-01 15:05:22  \n",
       "14       71.56    1645.88 2014-01-04 08:57:48  \n",
       "15       71.51    2431.34 2014-01-04 11:34:58  "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df[\"quantity\"] > 22].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we want to do more complex filtering, we can use `map` to filter. In this example, let's look for items with sku's that start with B1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-65551</td>\n",
       "      <td>2</td>\n",
       "      <td>31.10</td>\n",
       "      <td>62.20</td>\n",
       "      <td>2014-01-02 10:57:23</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>23</td>\n",
       "      <td>71.56</td>\n",
       "      <td>1645.88</td>\n",
       "      <td>2014-01-04 08:57:48</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>239344</td>\n",
       "      <td>Stokes LLC</td>\n",
       "      <td>B1-50809</td>\n",
       "      <td>14</td>\n",
       "      <td>16.23</td>\n",
       "      <td>227.22</td>\n",
       "      <td>2014-01-04 22:14:32</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    account number                          name       sku  quantity  \\\n",
       "0           740150                    Barton LLC  B1-20000        39   \n",
       "2           218895                     Kulas Inc  B1-69924        23   \n",
       "6           218895                     Kulas Inc  B1-65551         2   \n",
       "14          737550  Fritsch, Russel and Anderson  B1-53102        23   \n",
       "17          239344                    Stokes LLC  B1-50809        14   \n",
       "\n",
       "    unit price  ext price                date  \n",
       "0        86.69    3380.91 2014-01-01 07:21:51  \n",
       "2        90.70    2086.10 2014-01-01 13:24:58  \n",
       "6        31.10      62.20 2014-01-02 10:57:23  \n",
       "14       71.56    1645.88 2014-01-04 08:57:48  \n",
       "17       16.23     227.22 2014-01-04 22:14:32  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df[\"sku\"].map(lambda x: x.startswith('B1'))].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's easy to chain two statements together using the &."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>23</td>\n",
       "      <td>71.56</td>\n",
       "      <td>1645.88</td>\n",
       "      <td>2014-01-04 08:57:48</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-53636</td>\n",
       "      <td>42</td>\n",
       "      <td>42.06</td>\n",
       "      <td>1766.52</td>\n",
       "      <td>2014-01-08 00:02:11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>B1-33087</td>\n",
       "      <td>32</td>\n",
       "      <td>19.56</td>\n",
       "      <td>625.92</td>\n",
       "      <td>2014-01-09 10:16:32</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    account number                          name       sku  quantity  \\\n",
       "0           740150                    Barton LLC  B1-20000        39   \n",
       "2           218895                     Kulas Inc  B1-69924        23   \n",
       "14          737550  Fritsch, Russel and Anderson  B1-53102        23   \n",
       "26          737550  Fritsch, Russel and Anderson  B1-53636        42   \n",
       "31          714466               Trantow-Barrows  B1-33087        32   \n",
       "\n",
       "    unit price  ext price                date  \n",
       "0        86.69    3380.91 2014-01-01 07:21:51  \n",
       "2        90.70    2086.10 2014-01-01 13:24:58  \n",
       "14       71.56    1645.88 2014-01-04 08:57:48  \n",
       "26       42.06    1766.52 2014-01-08 00:02:11  \n",
       "31       19.56     625.92 2014-01-09 10:16:32  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df[\"sku\"].map(lambda x: x.startswith('B1')) & (df[\"quantity\"] > 22)].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another useful function that pandas supports is called `isin`. It allows us to define a list of values we want to look for.\n",
    "\n",
    "In this case, we look for all records that include two specific account numbers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "      <td>2014-01-01 10:00:47</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>17</td>\n",
       "      <td>87.63</td>\n",
       "      <td>1489.71</td>\n",
       "      <td>2014-01-02 10:07:15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-65551</td>\n",
       "      <td>2</td>\n",
       "      <td>31.10</td>\n",
       "      <td>62.20</td>\n",
       "      <td>2014-01-02 10:57:23</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S1-50961</td>\n",
       "      <td>22</td>\n",
       "      <td>84.09</td>\n",
       "      <td>1849.98</td>\n",
       "      <td>2014-01-03 11:29:02</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   account number             name       sku  quantity  unit price  ext price  \\\n",
       "1          714466  Trantow-Barrows  S2-77896        -1       63.16     -63.16   \n",
       "2          218895        Kulas Inc  B1-69924        23       90.70    2086.10   \n",
       "5          714466  Trantow-Barrows  S2-77896        17       87.63    1489.71   \n",
       "6          218895        Kulas Inc  B1-65551         2       31.10      62.20   \n",
       "8          714466  Trantow-Barrows  S1-50961        22       84.09    1849.98   \n",
       "\n",
       "                 date  \n",
       "1 2014-01-01 10:00:47  \n",
       "2 2014-01-01 13:24:58  \n",
       "5 2014-01-02 10:07:15  \n",
       "6 2014-01-02 10:57:23  \n",
       "8 2014-01-03 11:29:02  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df[\"account number\"].isin([714466,218895])].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pandas supports another function called `query` which allows you to efficiently select subsets of data. It does require the installation of [numexpr](https://github.com/pydata/numexpr) so make sure you have it installed before trying this step.\n",
    "\n",
    "If you would like to get a list of customers by name, you can do that with a query, similar to the python syntax shown above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-65551</td>\n",
       "      <td>2</td>\n",
       "      <td>31.10</td>\n",
       "      <td>62.20</td>\n",
       "      <td>2014-01-02 10:57:23</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>S1-06532</td>\n",
       "      <td>3</td>\n",
       "      <td>22.36</td>\n",
       "      <td>67.08</td>\n",
       "      <td>2014-01-09 23:58:27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>16</td>\n",
       "      <td>73.04</td>\n",
       "      <td>1168.64</td>\n",
       "      <td>2014-01-10 12:07:30</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    account number        name       sku  quantity  unit price  ext price  \\\n",
       "0           740150  Barton LLC  B1-20000        39       86.69    3380.91   \n",
       "2           218895   Kulas Inc  B1-69924        23       90.70    2086.10   \n",
       "6           218895   Kulas Inc  B1-65551         2       31.10      62.20   \n",
       "33          218895   Kulas Inc  S1-06532         3       22.36      67.08   \n",
       "36          218895   Kulas Inc  S2-34077        16       73.04    1168.64   \n",
       "\n",
       "                  date  \n",
       "0  2014-01-01 07:21:51  \n",
       "2  2014-01-01 13:24:58  \n",
       "6  2014-01-02 10:57:23  \n",
       "33 2014-01-09 23:58:27  \n",
       "36 2014-01-10 12:07:30  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.query('name == [\"Kulas Inc\",\"Barton LLC\"]').head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The query function allows you do more than just this simple example but for the purposes of this discussion, I'm showing it so you are aware that it is out there for you."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Working with Dates"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using pandas, you can do complex filtering on dates. Before doing anything with dates, I encourage you to sort by the date column to make sure the results return what you are expecting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "      <td>2014-01-01 10:00:47</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "      <td>2014-01-01 15:05:22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>6</td>\n",
       "      <td>83.21</td>\n",
       "      <td>499.26</td>\n",
       "      <td>2014-01-01 23:26:55</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   account number                         name       sku  quantity  \\\n",
       "0          740150                   Barton LLC  B1-20000        39   \n",
       "1          714466              Trantow-Barrows  S2-77896        -1   \n",
       "2          218895                    Kulas Inc  B1-69924        23   \n",
       "3          307599  Kassulke, Ondricka and Metz  S1-65481        41   \n",
       "4          412290                Jerde-Hilpert  S2-34077         6   \n",
       "\n",
       "   unit price  ext price                date  \n",
       "0       86.69    3380.91 2014-01-01 07:21:51  \n",
       "1       63.16     -63.16 2014-01-01 10:00:47  \n",
       "2       90.70    2086.10 2014-01-01 13:24:58  \n",
       "3       21.05     863.05 2014-01-01 15:05:22  \n",
       "4       83.21     499.26 2014-01-01 23:26:55  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = df.sort_values(by='date')\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The python filtering syntax shown before works with dates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1042</th>\n",
       "      <td>163416</td>\n",
       "      <td>Purdy-Kunde</td>\n",
       "      <td>B1-38851</td>\n",
       "      <td>41</td>\n",
       "      <td>98.69</td>\n",
       "      <td>4046.29</td>\n",
       "      <td>2014-09-05 01:52:32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1043</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S1-30248</td>\n",
       "      <td>1</td>\n",
       "      <td>37.16</td>\n",
       "      <td>37.16</td>\n",
       "      <td>2014-09-05 06:17:19</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1044</th>\n",
       "      <td>729833</td>\n",
       "      <td>Koepp Ltd</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>48</td>\n",
       "      <td>16.04</td>\n",
       "      <td>769.92</td>\n",
       "      <td>2014-09-05 08:54:41</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1045</th>\n",
       "      <td>729833</td>\n",
       "      <td>Koepp Ltd</td>\n",
       "      <td>S2-11481</td>\n",
       "      <td>6</td>\n",
       "      <td>26.50</td>\n",
       "      <td>159.00</td>\n",
       "      <td>2014-09-05 16:33:15</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1046</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-33364</td>\n",
       "      <td>4</td>\n",
       "      <td>76.44</td>\n",
       "      <td>305.76</td>\n",
       "      <td>2014-09-06 08:59:08</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      account number                          name       sku  quantity  \\\n",
       "1042          163416                   Purdy-Kunde  B1-38851        41   \n",
       "1043          714466               Trantow-Barrows  S1-30248         1   \n",
       "1044          729833                     Koepp Ltd  S1-65481        48   \n",
       "1045          729833                     Koepp Ltd  S2-11481         6   \n",
       "1046          737550  Fritsch, Russel and Anderson  B1-33364         4   \n",
       "\n",
       "      unit price  ext price                date  \n",
       "1042       98.69    4046.29 2014-09-05 01:52:32  \n",
       "1043       37.16      37.16 2014-09-05 06:17:19  \n",
       "1044       16.04     769.92 2014-09-05 08:54:41  \n",
       "1045       26.50     159.00 2014-09-05 16:33:15  \n",
       "1046       76.44     305.76 2014-09-06 08:59:08  "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['date'] >='20140905'].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One of the really nice features of pandas is that it understands dates so will allow us to do partial filtering. If we want to only look for data more recent than a specific month, we can do so."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>242</th>\n",
       "      <td>163416</td>\n",
       "      <td>Purdy-Kunde</td>\n",
       "      <td>S1-30248</td>\n",
       "      <td>19</td>\n",
       "      <td>65.03</td>\n",
       "      <td>1235.57</td>\n",
       "      <td>2014-03-01 16:07:40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>243</th>\n",
       "      <td>527099</td>\n",
       "      <td>Sanford and Sons</td>\n",
       "      <td>S2-82423</td>\n",
       "      <td>3</td>\n",
       "      <td>76.21</td>\n",
       "      <td>228.63</td>\n",
       "      <td>2014-03-01 17:18:01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>244</th>\n",
       "      <td>527099</td>\n",
       "      <td>Sanford and Sons</td>\n",
       "      <td>B1-50809</td>\n",
       "      <td>8</td>\n",
       "      <td>70.78</td>\n",
       "      <td>566.24</td>\n",
       "      <td>2014-03-01 18:53:09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>245</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-50809</td>\n",
       "      <td>20</td>\n",
       "      <td>50.11</td>\n",
       "      <td>1002.20</td>\n",
       "      <td>2014-03-01 23:47:17</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>246</th>\n",
       "      <td>688981</td>\n",
       "      <td>Keeling LLC</td>\n",
       "      <td>B1-86481</td>\n",
       "      <td>-1</td>\n",
       "      <td>97.16</td>\n",
       "      <td>-97.16</td>\n",
       "      <td>2014-03-02 01:46:44</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     account number                          name       sku  quantity  \\\n",
       "242          163416                   Purdy-Kunde  S1-30248        19   \n",
       "243          527099              Sanford and Sons  S2-82423         3   \n",
       "244          527099              Sanford and Sons  B1-50809         8   \n",
       "245          737550  Fritsch, Russel and Anderson  B1-50809        20   \n",
       "246          688981                   Keeling LLC  B1-86481        -1   \n",
       "\n",
       "     unit price  ext price                date  \n",
       "242       65.03    1235.57 2014-03-01 16:07:40  \n",
       "243       76.21     228.63 2014-03-01 17:18:01  \n",
       "244       70.78     566.24 2014-03-01 18:53:09  \n",
       "245       50.11    1002.20 2014-03-01 23:47:17  \n",
       "246       97.16     -97.16 2014-03-02 01:46:44  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['date'] >='2014-03'].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Of course, you can chain the criteria."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>778</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>35</td>\n",
       "      <td>70.51</td>\n",
       "      <td>2467.85</td>\n",
       "      <td>2014-07-01 00:21:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>779</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>S1-30248</td>\n",
       "      <td>9</td>\n",
       "      <td>16.56</td>\n",
       "      <td>149.04</td>\n",
       "      <td>2014-07-01 00:52:38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>780</th>\n",
       "      <td>163416</td>\n",
       "      <td>Purdy-Kunde</td>\n",
       "      <td>S2-82423</td>\n",
       "      <td>44</td>\n",
       "      <td>68.27</td>\n",
       "      <td>3003.88</td>\n",
       "      <td>2014-07-01 08:15:52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>781</th>\n",
       "      <td>672390</td>\n",
       "      <td>Kuhn-Gusikowski</td>\n",
       "      <td>B1-04202</td>\n",
       "      <td>48</td>\n",
       "      <td>99.39</td>\n",
       "      <td>4770.72</td>\n",
       "      <td>2014-07-01 11:12:13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>782</th>\n",
       "      <td>642753</td>\n",
       "      <td>Pollich LLC</td>\n",
       "      <td>S2-23246</td>\n",
       "      <td>1</td>\n",
       "      <td>51.29</td>\n",
       "      <td>51.29</td>\n",
       "      <td>2014-07-02 04:02:39</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     account number                          name       sku  quantity  \\\n",
       "778          737550  Fritsch, Russel and Anderson  S1-65481        35   \n",
       "779          218895                     Kulas Inc  S1-30248         9   \n",
       "780          163416                   Purdy-Kunde  S2-82423        44   \n",
       "781          672390               Kuhn-Gusikowski  B1-04202        48   \n",
       "782          642753                   Pollich LLC  S2-23246         1   \n",
       "\n",
       "     unit price  ext price                date  \n",
       "778       70.51    2467.85 2014-07-01 00:21:58  \n",
       "779       16.56     149.04 2014-07-01 00:52:38  \n",
       "780       68.27    3003.88 2014-07-01 08:15:52  \n",
       "781       99.39    4770.72 2014-07-01 11:12:13  \n",
       "782       51.29      51.29 2014-07-02 04:02:39  "
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[(df['date'] >='20140701') & (df['date'] <= '20140715')].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because pandas understands date columns, you can express the date value in multiple formats and it will give you the results you expect."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1141</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>B1-50809</td>\n",
       "      <td>25</td>\n",
       "      <td>56.63</td>\n",
       "      <td>1415.75</td>\n",
       "      <td>2014-10-01 10:56:32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1142</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>S2-82423</td>\n",
       "      <td>38</td>\n",
       "      <td>45.17</td>\n",
       "      <td>1716.46</td>\n",
       "      <td>2014-10-01 16:17:24</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1143</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>S1-47412</td>\n",
       "      <td>6</td>\n",
       "      <td>68.68</td>\n",
       "      <td>412.08</td>\n",
       "      <td>2014-10-01 22:28:49</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1144</th>\n",
       "      <td>146832</td>\n",
       "      <td>Kiehn-Spinka</td>\n",
       "      <td>S2-11481</td>\n",
       "      <td>13</td>\n",
       "      <td>18.80</td>\n",
       "      <td>244.40</td>\n",
       "      <td>2014-10-02 00:31:01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1145</th>\n",
       "      <td>424914</td>\n",
       "      <td>White-Trantow</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>9</td>\n",
       "      <td>94.47</td>\n",
       "      <td>850.23</td>\n",
       "      <td>2014-10-02 02:48:26</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      account number                          name       sku  quantity  \\\n",
       "1141          307599   Kassulke, Ondricka and Metz  B1-50809        25   \n",
       "1142          737550  Fritsch, Russel and Anderson  S2-82423        38   \n",
       "1143          737550  Fritsch, Russel and Anderson  S1-47412         6   \n",
       "1144          146832                  Kiehn-Spinka  S2-11481        13   \n",
       "1145          424914                 White-Trantow  B1-53102         9   \n",
       "\n",
       "      unit price  ext price                date  \n",
       "1141       56.63    1415.75 2014-10-01 10:56:32  \n",
       "1142       45.17    1716.46 2014-10-01 16:17:24  \n",
       "1143       68.68     412.08 2014-10-01 22:28:49  \n",
       "1144       18.80     244.40 2014-10-02 00:31:01  \n",
       "1145       94.47     850.23 2014-10-02 02:48:26  "
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['date'] >= 'Oct-2014'].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1174</th>\n",
       "      <td>257198</td>\n",
       "      <td>Cronin, Oberbrunner and Spencer</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>13</td>\n",
       "      <td>12.24</td>\n",
       "      <td>159.12</td>\n",
       "      <td>2014-10-10 02:59:06</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1175</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>28</td>\n",
       "      <td>53.00</td>\n",
       "      <td>1484.00</td>\n",
       "      <td>2014-10-10 15:08:53</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1176</th>\n",
       "      <td>146832</td>\n",
       "      <td>Kiehn-Spinka</td>\n",
       "      <td>S1-27722</td>\n",
       "      <td>15</td>\n",
       "      <td>64.39</td>\n",
       "      <td>965.85</td>\n",
       "      <td>2014-10-10 18:24:01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1177</th>\n",
       "      <td>257198</td>\n",
       "      <td>Cronin, Oberbrunner and Spencer</td>\n",
       "      <td>S2-16558</td>\n",
       "      <td>3</td>\n",
       "      <td>35.34</td>\n",
       "      <td>106.02</td>\n",
       "      <td>2014-10-11 01:48:13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1178</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-53636</td>\n",
       "      <td>10</td>\n",
       "      <td>56.95</td>\n",
       "      <td>569.50</td>\n",
       "      <td>2014-10-11 10:25:53</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      account number                             name       sku  quantity  \\\n",
       "1174          257198  Cronin, Oberbrunner and Spencer  S2-34077        13   \n",
       "1175          740150                       Barton LLC  S1-65481        28   \n",
       "1176          146832                     Kiehn-Spinka  S1-27722        15   \n",
       "1177          257198  Cronin, Oberbrunner and Spencer  S2-16558         3   \n",
       "1178          737550     Fritsch, Russel and Anderson  B1-53636        10   \n",
       "\n",
       "      unit price  ext price                date  \n",
       "1174       12.24     159.12 2014-10-10 02:59:06  \n",
       "1175       53.00    1484.00 2014-10-10 15:08:53  \n",
       "1176       64.39     965.85 2014-10-10 18:24:01  \n",
       "1177       35.34     106.02 2014-10-11 01:48:13  \n",
       "1178       56.95     569.50 2014-10-11 10:25:53  "
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['date'] >= '10-10-2014'].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When working with time series data, if we convert the data to use the date as at the index, we can do some more filtering.\n",
    "\n",
    "Set the new index using `set_index`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2014-01-01 07:21:51</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 10:00:47</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 13:24:58</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 15:05:22</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 23:26:55</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>6</td>\n",
       "      <td>83.21</td>\n",
       "      <td>499.26</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     account number                         name       sku  \\\n",
       "date                                                                         \n",
       "2014-01-01 07:21:51          740150                   Barton LLC  B1-20000   \n",
       "2014-01-01 10:00:47          714466              Trantow-Barrows  S2-77896   \n",
       "2014-01-01 13:24:58          218895                    Kulas Inc  B1-69924   \n",
       "2014-01-01 15:05:22          307599  Kassulke, Ondricka and Metz  S1-65481   \n",
       "2014-01-01 23:26:55          412290                Jerde-Hilpert  S2-34077   \n",
       "\n",
       "                     quantity  unit price  ext price  \n",
       "date                                                  \n",
       "2014-01-01 07:21:51        39       86.69    3380.91  \n",
       "2014-01-01 10:00:47        -1       63.16     -63.16  \n",
       "2014-01-01 13:24:58        23       90.70    2086.10  \n",
       "2014-01-01 15:05:22        41       21.05     863.05  \n",
       "2014-01-01 23:26:55         6       83.21     499.26  "
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2 = df.set_index(['date'])\n",
    "df2.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can slice the data to get a range."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2014-01-01 07:21:51</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 10:00:47</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 13:24:58</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 15:05:22</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 23:26:55</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>6</td>\n",
       "      <td>83.21</td>\n",
       "      <td>499.26</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     account number                         name       sku  \\\n",
       "date                                                                         \n",
       "2014-01-01 07:21:51          740150                   Barton LLC  B1-20000   \n",
       "2014-01-01 10:00:47          714466              Trantow-Barrows  S2-77896   \n",
       "2014-01-01 13:24:58          218895                    Kulas Inc  B1-69924   \n",
       "2014-01-01 15:05:22          307599  Kassulke, Ondricka and Metz  S1-65481   \n",
       "2014-01-01 23:26:55          412290                Jerde-Hilpert  S2-34077   \n",
       "\n",
       "                     quantity  unit price  ext price  \n",
       "date                                                  \n",
       "2014-01-01 07:21:51        39       86.69    3380.91  \n",
       "2014-01-01 10:00:47        -1       63.16     -63.16  \n",
       "2014-01-01 13:24:58        23       90.70    2086.10  \n",
       "2014-01-01 15:05:22        41       21.05     863.05  \n",
       "2014-01-01 23:26:55         6       83.21     499.26  "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2[\"20140101\":\"20140201\"].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once again, we can use various date representations to remove any ambiguity around date naming conventions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2014-01-01 07:21:51</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 10:00:47</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 13:24:58</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 15:05:22</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 23:26:55</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>6</td>\n",
       "      <td>83.21</td>\n",
       "      <td>499.26</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     account number                         name       sku  \\\n",
       "date                                                                         \n",
       "2014-01-01 07:21:51          740150                   Barton LLC  B1-20000   \n",
       "2014-01-01 10:00:47          714466              Trantow-Barrows  S2-77896   \n",
       "2014-01-01 13:24:58          218895                    Kulas Inc  B1-69924   \n",
       "2014-01-01 15:05:22          307599  Kassulke, Ondricka and Metz  S1-65481   \n",
       "2014-01-01 23:26:55          412290                Jerde-Hilpert  S2-34077   \n",
       "\n",
       "                     quantity  unit price  ext price  \n",
       "date                                                  \n",
       "2014-01-01 07:21:51        39       86.69    3380.91  \n",
       "2014-01-01 10:00:47        -1       63.16     -63.16  \n",
       "2014-01-01 13:24:58        23       90.70    2086.10  \n",
       "2014-01-01 15:05:22        41       21.05     863.05  \n",
       "2014-01-01 23:26:55         6       83.21     499.26  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2[\"2014-Jan-1\":\"2014-Feb-1\"].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2014-01-31 22:51:18</th>\n",
       "      <td>383080</td>\n",
       "      <td>Will LLC</td>\n",
       "      <td>B1-05914</td>\n",
       "      <td>43</td>\n",
       "      <td>80.17</td>\n",
       "      <td>3447.31</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-02-01 09:04:59</th>\n",
       "      <td>383080</td>\n",
       "      <td>Will LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>7</td>\n",
       "      <td>33.69</td>\n",
       "      <td>235.83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-02-01 11:51:46</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S1-27722</td>\n",
       "      <td>11</td>\n",
       "      <td>21.12</td>\n",
       "      <td>232.32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-02-01 17:24:32</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>B1-86481</td>\n",
       "      <td>3</td>\n",
       "      <td>35.99</td>\n",
       "      <td>107.97</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-02-01 19:56:48</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>23</td>\n",
       "      <td>78.90</td>\n",
       "      <td>1814.70</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     account number           name       sku  quantity  \\\n",
       "date                                                                     \n",
       "2014-01-31 22:51:18          383080       Will LLC  B1-05914        43   \n",
       "2014-02-01 09:04:59          383080       Will LLC  B1-20000         7   \n",
       "2014-02-01 11:51:46          412290  Jerde-Hilpert  S1-27722        11   \n",
       "2014-02-01 17:24:32          412290  Jerde-Hilpert  B1-86481         3   \n",
       "2014-02-01 19:56:48          412290  Jerde-Hilpert  B1-20000        23   \n",
       "\n",
       "                     unit price  ext price  \n",
       "date                                        \n",
       "2014-01-31 22:51:18       80.17    3447.31  \n",
       "2014-02-01 09:04:59       33.69     235.83  \n",
       "2014-02-01 11:51:46       21.12     232.32  \n",
       "2014-02-01 17:24:32       35.99     107.97  \n",
       "2014-02-01 19:56:48       78.90    1814.70  "
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2[\"2014-Jan-1\":\"2014-Feb-1\"].tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2014-01-01 07:21:51</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 10:00:47</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 13:24:58</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 15:05:22</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-01-01 23:26:55</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>6</td>\n",
       "      <td>83.21</td>\n",
       "      <td>499.26</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     account number                         name       sku  \\\n",
       "date                                                                         \n",
       "2014-01-01 07:21:51          740150                   Barton LLC  B1-20000   \n",
       "2014-01-01 10:00:47          714466              Trantow-Barrows  S2-77896   \n",
       "2014-01-01 13:24:58          218895                    Kulas Inc  B1-69924   \n",
       "2014-01-01 15:05:22          307599  Kassulke, Ondricka and Metz  S1-65481   \n",
       "2014-01-01 23:26:55          412290                Jerde-Hilpert  S2-34077   \n",
       "\n",
       "                     quantity  unit price  ext price  \n",
       "date                                                  \n",
       "2014-01-01 07:21:51        39       86.69    3380.91  \n",
       "2014-01-01 10:00:47        -1       63.16     -63.16  \n",
       "2014-01-01 13:24:58        23       90.70    2086.10  \n",
       "2014-01-01 15:05:22        41       21.05     863.05  \n",
       "2014-01-01 23:26:55         6       83.21     499.26  "
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2[\"2014\"].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2014-12-01 20:15:34</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S1-82801</td>\n",
       "      <td>3</td>\n",
       "      <td>77.97</td>\n",
       "      <td>233.91</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-12-02 20:00:04</th>\n",
       "      <td>146832</td>\n",
       "      <td>Kiehn-Spinka</td>\n",
       "      <td>S2-23246</td>\n",
       "      <td>37</td>\n",
       "      <td>57.81</td>\n",
       "      <td>2138.97</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-12-03 04:43:53</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>30</td>\n",
       "      <td>77.44</td>\n",
       "      <td>2323.20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-12-03 06:05:43</th>\n",
       "      <td>141962</td>\n",
       "      <td>Herman LLC</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>20</td>\n",
       "      <td>26.12</td>\n",
       "      <td>522.40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014-12-03 14:17:34</th>\n",
       "      <td>642753</td>\n",
       "      <td>Pollich LLC</td>\n",
       "      <td>B1-53636</td>\n",
       "      <td>19</td>\n",
       "      <td>71.21</td>\n",
       "      <td>1352.99</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                     account number             name       sku  quantity  \\\n",
       "date                                                                       \n",
       "2014-12-01 20:15:34          714466  Trantow-Barrows  S1-82801         3   \n",
       "2014-12-02 20:00:04          146832     Kiehn-Spinka  S2-23246        37   \n",
       "2014-12-03 04:43:53          218895        Kulas Inc  S2-77896        30   \n",
       "2014-12-03 06:05:43          141962       Herman LLC  B1-53102        20   \n",
       "2014-12-03 14:17:34          642753      Pollich LLC  B1-53636        19   \n",
       "\n",
       "                     unit price  ext price  \n",
       "date                                        \n",
       "2014-12-01 20:15:34       77.97     233.91  \n",
       "2014-12-02 20:00:04       57.81    2138.97  \n",
       "2014-12-03 04:43:53       77.44    2323.20  \n",
       "2014-12-03 06:05:43       26.12     522.40  \n",
       "2014-12-03 14:17:34       71.21    1352.99  "
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2[\"2014-Dec\"].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Additional String Functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pandas has support for vectorized string functions as well. If we want to identify all the skus that contain a certain value, we can use `str.contains`. In this case, we know that the sku is always represented in the same way, so B1 only shows up in the front of the sku."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-65551</td>\n",
       "      <td>2</td>\n",
       "      <td>31.10</td>\n",
       "      <td>62.20</td>\n",
       "      <td>2014-01-02 10:57:23</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>23</td>\n",
       "      <td>71.56</td>\n",
       "      <td>1645.88</td>\n",
       "      <td>2014-01-04 08:57:48</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>239344</td>\n",
       "      <td>Stokes LLC</td>\n",
       "      <td>B1-50809</td>\n",
       "      <td>14</td>\n",
       "      <td>16.23</td>\n",
       "      <td>227.22</td>\n",
       "      <td>2014-01-04 22:14:32</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    account number                          name       sku  quantity  \\\n",
       "0           740150                    Barton LLC  B1-20000        39   \n",
       "2           218895                     Kulas Inc  B1-69924        23   \n",
       "6           218895                     Kulas Inc  B1-65551         2   \n",
       "14          737550  Fritsch, Russel and Anderson  B1-53102        23   \n",
       "17          239344                    Stokes LLC  B1-50809        14   \n",
       "\n",
       "    unit price  ext price                date  \n",
       "0        86.69    3380.91 2014-01-01 07:21:51  \n",
       "2        90.70    2086.10 2014-01-01 13:24:58  \n",
       "6        31.10      62.20 2014-01-02 10:57:23  \n",
       "14       71.56    1645.88 2014-01-04 08:57:48  \n",
       "17       16.23     227.22 2014-01-04 22:14:32  "
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['sku'].str.contains('B1')].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can string queries together and use sort to control how the data is ordered."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A common need I have in Excel is to understand all the unique items in a column. For instance, maybe I only want to know when customers purchased in this time period. The unique function makes this trivial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>684</th>\n",
       "      <td>642753</td>\n",
       "      <td>Pollich LLC</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>46</td>\n",
       "      <td>26.07</td>\n",
       "      <td>1199.22</td>\n",
       "      <td>2014-06-08 19:33:33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>792</th>\n",
       "      <td>688981</td>\n",
       "      <td>Keeling LLC</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>45</td>\n",
       "      <td>41.19</td>\n",
       "      <td>1853.55</td>\n",
       "      <td>2014-07-04 21:42:22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>176</th>\n",
       "      <td>383080</td>\n",
       "      <td>Will LLC</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>45</td>\n",
       "      <td>89.22</td>\n",
       "      <td>4014.90</td>\n",
       "      <td>2014-02-11 04:14:09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1213</th>\n",
       "      <td>604255</td>\n",
       "      <td>Halvorson, Crona and Champlin</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>41</td>\n",
       "      <td>55.05</td>\n",
       "      <td>2257.05</td>\n",
       "      <td>2014-10-18 19:27:01</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1215</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>41</td>\n",
       "      <td>93.70</td>\n",
       "      <td>3841.70</td>\n",
       "      <td>2014-10-18 23:25:10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1128</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>41</td>\n",
       "      <td>55.68</td>\n",
       "      <td>2282.88</td>\n",
       "      <td>2014-09-27 10:42:48</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1001</th>\n",
       "      <td>424914</td>\n",
       "      <td>White-Trantow</td>\n",
       "      <td>B1-53102</td>\n",
       "      <td>41</td>\n",
       "      <td>81.25</td>\n",
       "      <td>3331.25</td>\n",
       "      <td>2014-08-26 11:44:30</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      account number                           name       sku  quantity  \\\n",
       "684           642753                    Pollich LLC  B1-53102        46   \n",
       "792           688981                    Keeling LLC  B1-53102        45   \n",
       "176           383080                       Will LLC  B1-53102        45   \n",
       "1213          604255  Halvorson, Crona and Champlin  B1-53102        41   \n",
       "1215          307599    Kassulke, Ondricka and Metz  B1-53102        41   \n",
       "1128          714466                Trantow-Barrows  B1-53102        41   \n",
       "1001          424914                  White-Trantow  B1-53102        41   \n",
       "\n",
       "      unit price  ext price                date  \n",
       "684        26.07    1199.22 2014-06-08 19:33:33  \n",
       "792        41.19    1853.55 2014-07-04 21:42:22  \n",
       "176        89.22    4014.90 2014-02-11 04:14:09  \n",
       "1213       55.05    2257.05 2014-10-18 19:27:01  \n",
       "1215       93.70    3841.70 2014-10-18 23:25:10  \n",
       "1128       55.68    2282.88 2014-09-27 10:42:48  \n",
       "1001       81.25    3331.25 2014-08-26 11:44:30  "
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[(df['sku'].str.contains('B1-531')) & (df['quantity']>40)].sort_values(by=['quantity','name'],ascending=[0,1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Bonus Task"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I frequently find myself trying to get a list of unique items in a long list within Excel. It is a multi-step process to do this in Excel but is fairly simple in pandas. We just use the `unique` function on a column to get the list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['Barton LLC', 'Trantow-Barrows', 'Kulas Inc',\n",
       "       'Kassulke, Ondricka and Metz', 'Jerde-Hilpert', 'Koepp Ltd',\n",
       "       'Fritsch, Russel and Anderson', 'Kiehn-Spinka', 'Keeling LLC',\n",
       "       'Frami, Hills and Schmidt', 'Stokes LLC', 'Kuhn-Gusikowski',\n",
       "       'Herman LLC', 'White-Trantow', 'Sanford and Sons', 'Pollich LLC',\n",
       "       'Will LLC', 'Cronin, Oberbrunner and Spencer',\n",
       "       'Halvorson, Crona and Champlin', 'Purdy-Kunde'], dtype=object)"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[\"name\"].unique()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we wanted to include the account number, we could use `drop_duplicates`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "      <th>sku</th>\n",
       "      <th>quantity</th>\n",
       "      <th>unit price</th>\n",
       "      <th>ext price</th>\n",
       "      <th>date</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "      <td>B1-20000</td>\n",
       "      <td>39</td>\n",
       "      <td>86.69</td>\n",
       "      <td>3380.91</td>\n",
       "      <td>2014-01-01 07:21:51</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "      <td>S2-77896</td>\n",
       "      <td>-1</td>\n",
       "      <td>63.16</td>\n",
       "      <td>-63.16</td>\n",
       "      <td>2014-01-01 10:00:47</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "      <td>B1-69924</td>\n",
       "      <td>23</td>\n",
       "      <td>90.70</td>\n",
       "      <td>2086.10</td>\n",
       "      <td>2014-01-01 13:24:58</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "      <td>S1-65481</td>\n",
       "      <td>41</td>\n",
       "      <td>21.05</td>\n",
       "      <td>863.05</td>\n",
       "      <td>2014-01-01 15:05:22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "      <td>S2-34077</td>\n",
       "      <td>6</td>\n",
       "      <td>83.21</td>\n",
       "      <td>499.26</td>\n",
       "      <td>2014-01-01 23:26:55</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   account number                         name       sku  quantity  \\\n",
       "0          740150                   Barton LLC  B1-20000        39   \n",
       "1          714466              Trantow-Barrows  S2-77896        -1   \n",
       "2          218895                    Kulas Inc  B1-69924        23   \n",
       "3          307599  Kassulke, Ondricka and Metz  S1-65481        41   \n",
       "4          412290                Jerde-Hilpert  S2-34077         6   \n",
       "\n",
       "   unit price  ext price                date  \n",
       "0       86.69    3380.91 2014-01-01 07:21:51  \n",
       "1       63.16     -63.16 2014-01-01 10:00:47  \n",
       "2       90.70    2086.10 2014-01-01 13:24:58  \n",
       "3       21.05     863.05 2014-01-01 15:05:22  \n",
       "4       83.21     499.26 2014-01-01 23:26:55  "
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.drop_duplicates(subset=[\"account number\",\"name\"]).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are obviously pulling in more data than we need and getting some non-useful information, so select only the first and second columns using `ix`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>account number</th>\n",
       "      <th>name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>740150</td>\n",
       "      <td>Barton LLC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>714466</td>\n",
       "      <td>Trantow-Barrows</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>218895</td>\n",
       "      <td>Kulas Inc</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>307599</td>\n",
       "      <td>Kassulke, Ondricka and Metz</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>412290</td>\n",
       "      <td>Jerde-Hilpert</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>729833</td>\n",
       "      <td>Koepp Ltd</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>737550</td>\n",
       "      <td>Fritsch, Russel and Anderson</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>146832</td>\n",
       "      <td>Kiehn-Spinka</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>688981</td>\n",
       "      <td>Keeling LLC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>786968</td>\n",
       "      <td>Frami, Hills and Schmidt</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>239344</td>\n",
       "      <td>Stokes LLC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>672390</td>\n",
       "      <td>Kuhn-Gusikowski</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>141962</td>\n",
       "      <td>Herman LLC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>424914</td>\n",
       "      <td>White-Trantow</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>527099</td>\n",
       "      <td>Sanford and Sons</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>642753</td>\n",
       "      <td>Pollich LLC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37</th>\n",
       "      <td>383080</td>\n",
       "      <td>Will LLC</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>51</th>\n",
       "      <td>257198</td>\n",
       "      <td>Cronin, Oberbrunner and Spencer</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>67</th>\n",
       "      <td>604255</td>\n",
       "      <td>Halvorson, Crona and Champlin</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>106</th>\n",
       "      <td>163416</td>\n",
       "      <td>Purdy-Kunde</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     account number                             name\n",
       "0            740150                       Barton LLC\n",
       "1            714466                  Trantow-Barrows\n",
       "2            218895                        Kulas Inc\n",
       "3            307599      Kassulke, Ondricka and Metz\n",
       "4            412290                    Jerde-Hilpert\n",
       "7            729833                        Koepp Ltd\n",
       "9            737550     Fritsch, Russel and Anderson\n",
       "10           146832                     Kiehn-Spinka\n",
       "11           688981                      Keeling LLC\n",
       "12           786968         Frami, Hills and Schmidt\n",
       "15           239344                       Stokes LLC\n",
       "16           672390                  Kuhn-Gusikowski\n",
       "18           141962                       Herman LLC\n",
       "20           424914                    White-Trantow\n",
       "21           527099                 Sanford and Sons\n",
       "30           642753                      Pollich LLC\n",
       "37           383080                         Will LLC\n",
       "51           257198  Cronin, Oberbrunner and Spencer\n",
       "67           604255    Halvorson, Crona and Champlin\n",
       "106          163416                      Purdy-Kunde"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.drop_duplicates(subset=[\"account number\",\"name\"]).iloc[:,[0,1]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I hope you found this useful. I encourage you to try and apply these ideas to some of your own repetitive Excel tasks and streamline your work flow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
